Joint multi-ancestry and admixed GWAS reveals the complex genetics behind human cranial vault shape

The cranial vault in humans is highly variable, clinically relevant, and heritable, yet its genetic architecture remains poorly understood. Here, we conduct a joint multi-ancestry and admixed multivariate genome-wide association study on 3D cranial vault shape extracted from magnetic resonance images of 6772 children from the ABCD study cohort yielding 30 genome-wide significant loci. Follow-up analyses indicate that these loci overlap with genomic risk loci for sagittal craniosynostosis, show elevated activity cranial neural crest cells, are enriched for processes related to skeletal development, and are shared with the face and brain. We present supporting evidence of regional localization for several of the identified genes based on expression patterns in the cranial vault bones of E15.5 mice. Overall, our study provides a comprehensive overview of the genetics underlying normal-range cranial vault shape and its relevance for understanding modern human craniofacial diversity and the etiology of congenital malformations.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.

A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g.means) or other basic estimates (e.g.regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g.confidence intervals) For null hypothesis testing, the test statistic (e.g.F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted Give P values as exact values whenever suitable.

April 2023
Field-specific reporting Please select the one below that is the best fit for your research.If you are not sure, read the appropriate sections before making your selection.

Life sciences
Behavioural & social sciences Ecological, evolutionary & environmental sciences For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
No statistical method was used to predetermine sample size for any analysis performed in this work.Sample sizes were determined to be sufficient based on results of previous GWAS of related phenotypes (e.g., cranial vault dimensions, facial shape) with similar sample sizes.Sample size was maximized based on data availability in the ABCD data repository and UK Biobank, after excluding samples that failed image processing, were outliers with respect to covariates.The number of CHiPseq datasets was deemed sufficient based on previous use of these datasets in White et al., 2021.The number RNAseq replicates was deemed sufficient based on differential expression analyses performed in other works.
Data exclusions MRI scans that failed QC at any point in the pipeline were excluded as described in the Methods, as well as participants with extreme or missing covariate values or whose recent ancestry could not be adequately modeled with the three ancestry components (European, African, and Indigenous American).These measures were determined prior to performing any GWAS analysis.

Replication
To measure the presence of the associated shape trait from the discovery panel (ABCD), the replication panel (UK Biobank) was projected onto the latent shape trait, identified by canonical correlation analysis, for a particular SNP in a particular cranial vault segment.The resulting univariate scores were calculated for each lead SNP/segment pair (n = 108) for which significant (P < 5e-8) associations were found.At 5% FDR, 55/108 tests were significant, and 20/30 SNPs significantly replicated in at least one cranial vault segment.Given the damage in the replication scans, this replication rate is conservative but within the expected range for GWAS.Future GWAS studies will likely replicate more of the signals identified in this study.We did not perform any other replication experiments.
Randomization MRI images from the ABCD study and UK Biobank datasets were assigned into groups based on SNP genotypes.Images from the ABCD study were adjusted for sex, age, height, weight, cranial size, 10 principal components representing global ancestry components, and African and European local ancestry components.Images from the UK Biobank were adjusted for sex, age, age squared, height, weight, cranial size, and 10 principal components representing global ancestry components.Testing of variants for risk of craniosynostosis was performed using a TDT test without adjustment for covariates since the effects of covariates on penetrance are the same for both alleles transmitted to the child.Parietal and Frontal bones were dissected in pairs each time from the same mouse embryo, therefore any mouse-related covariates are automatically controlled for.Randomization was not relevant to other experiments.

Blinding
Blinding was not relevant to our study as we did not compare cases and controls.Investigators did not have access to identifying information.

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies.Here, indicate whether each material, system or method listed is relevant to your study.The cranial vault surface was extracted from the outer head surface using a cranial vault surface atlas and non-rigid surface registration with the Meshmonk toolbox as described in the methods.Hierarchical datadriven shape segmentation was performed on the cranial vault surface.
Statistic type for inference (See Eklund et al. 2016) Surface-based, not voxel-based, shape variables were tested for association with SNP genotypes using canonical correlation analysis.

Correction
Correction of multivariate shape variables for covariates was performed using partial least squares regression.Correction for multiple testing was performed by an adjusted study-wide p-value threshold, obtained by dividing the genome-wide threshold by the number of effective tests per SNP as estimated through permutation testing.
Models & analysis n/a Involved in the study Functional and/or effective connectivity

Graph analysis
Multivariate modeling or predictive analysis Multivariate modeling and predictive analysis For each of 15 cranial vault segments separately, the set of 3D surface vertices in the segment were subjected to a GPA.A shape-space for each segment was built by conducting PCA on the pooled x, y, and z coordinates of each vertex within the segment and parallel analysis was subsequently used to retain the major axes of shape variation.Canonical correlation analysis was used to test associations between SNP genotypes and the multivariate shape variables.
If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.-pregnant female CD1 mice of 6-8 weeks old were purchased from Charles River Laboratory.The E15.5 embryos were collected via C-section after CO2 euthanasia of the pregnant dam.The mice were housed under standard conditions in the University of Pittsburgh Division of Laboratory Services vivarium. Timed